Overview

Dataset statistics

Number of variables: 20
Number of observations: 8145
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4549
Duplicate rows (%): 55.9%
Total size in memory: 1.2 MiB
Average record size in memory: 160.0 B
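
The percentage and size figures above can be cross-checked with simple arithmetic. The sketch below assumes the total size is roughly the number of observations times the average record size; the report's own accounting may include extra overhead, so treat this as a sanity check rather than the tool's method.

```python
# Figures copied from the dataset statistics above.
n_rows = 8145
n_duplicate_rows = 4549
avg_record_bytes = 160.0

# Share of rows flagged as duplicates, in percent.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Assumption: total memory is approximately rows x average record size.
total_mib = n_rows * avg_record_bytes / 2**20

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Both values round to the figures shown in the table.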

Variable types

NUM: 9
BOOL: 6
CAT: 5

Reproduction

Analysis started: 2020-08-25 01:03:55.435801
Analysis finished: 2020-08-25 01:04:09.740014
Duration: 14.3 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

veil-type has constant value "0" (Constant)
Dataset has 4549 (55.9%) duplicate rows (Duplicates)
cap-shape has 454 (5.6%) zeros (Zeros)
stalk-color-above-ring has 432 (5.3%) zeros (Zeros)
gill-color has 1728 (21.2%) zeros (Zeros)
population has 385 (4.7%) zeros (Zeros)
odor has 407 (5.0%) zeros (Zeros)
ring-type has 2780 (34.1%) zeros (Zeros)
cap-color has 168 (2.1%) zeros (Zeros)
habitat has 3151 (38.7%) zeros (Zeros)
stalk-root has 2480 (30.4%) zeros (Zeros)

Variables

cap-shape
Real number (ℝ≥0)

ZEROS

Distinct count: 6
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.3484346224677717
Minimum: 0
Maximum: 5
Zeros: 454
Zeros (%): 5.6%
Memory size: 63.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 3
Q3: 5
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.604770276
Coefficient of variation (CV): 0.4792598505
Kurtosis: -1.242066847
Mean: 3.348434622
Median Absolute Deviation (MAD): 2
Skewness: -0.2480877888
Sum: 27273
Variance: 2.57528764
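
Several of the cap-shape statistics can be recomputed from the value counts listed for this variable. The sketch below uses only the standard library and assumes the variance is the sample (n - 1) variance, which is the convention that matches the reported figure; the variable names are illustrative.

```python
import math
import statistics

# Value -> count for cap-shape, copied from the value/count table in this section.
counts = {0: 454, 1: 4, 2: 3159, 3: 828, 4: 33, 5: 3667}

# Expand the table back into one observation per row.
values = [v for v, c in counts.items() for _ in range(c)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation

print(total, mean, variance, std, cv, median, mad)
```

The printed values agree with the Sum, Mean, Variance, Standard deviation, CV, Median and MAD entries above to the displayed precision.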
Histogram with fixed size bins (bins=10)
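
As a rough illustration of fixed-size binning, the sketch below splits the cap-shape range into 10 equal-width bins and assigns each value to a bin. It assumes the bins span [minimum, maximum] and that the right edge belongs to the last bin; the report's exact binning convention is not stated here, and `fixed_width_histogram` is a hypothetical helper.

```python
# Value -> count for cap-shape, copied from the value/count table in this section.
counts = {0: 454, 1: 4, 2: 3159, 3: 828, 4: 33, 5: 3667}


def fixed_width_histogram(counts, bins=10):
    """Return (edges, per-bin totals) for equal-width bins over [min, max]."""
    lo, hi = min(counts), max(counts)
    width = (hi - lo) / bins
    hist = [0] * bins
    for value, count in counts.items():
        # Assumption: the maximum value falls into the last bin, so clamp the index.
        idx = min(int((value - lo) / width), bins - 1)
        hist[idx] += count
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, hist


edges, hist = fixed_width_histogram(counts)
print(edges)
print(hist)
```

With six distinct integer values and ten bins, several bins are necessarily empty.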
Common values

Value  Count  Frequency (%)
5      3667   45.0%
2      3159   38.8%
3      828    10.2%
0      454    5.6%
4      33     0.4%
1      4      < 0.1%

Smallest values

Value  Count  Frequency (%)
0      454    5.6%
1      4      < 0.1%
2      3159   38.8%
3      828    10.2%
4      33     0.4%
5      3667   45.0%

Largest values

Value  Count  Frequency (%)
5      3667   45.0%
4      33     0.4%
3      828    10.2%
2      3159   38.8%
1      4      < 0.1%
0      454    5.6%

stalk-color-above-ring
Real number (ℝ≥0)

ZEROS

Distinct count: 9
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.8193984039287905
Minimum: 0
Maximum: 8
Zeros: 432
Zeros (%): 5.3%
Memory size: 63.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 6
Median: 7
Q3: 7
95-th percentile: 7
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.900242173
Coefficient of variation (CV): 0.3265358446
Kurtosis: 2.517571226
Mean: 5.819398404
Median Absolute Deviation (MAD): 0
Skewness: -1.839249759
Sum: 47399
Variance: 3.610920316
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
7      4485   55.1%
6      1872   23.0%
3      576    7.1%
4      448    5.5%
0      432    5.3%
5      192    2.4%
2      96     1.2%
1      36     0.4%
8      8      0.1%

Smallest values

Value  Count  Frequency (%)
0      432    5.3%
1      36     0.4%
2      96     1.2%
3      576    7.1%
4      448    5.5%
5      192    2.4%
6      1872   23.0%
7      4485   55.1%
8      8      0.1%

Largest values

Value  Count  Frequency (%)
8      8      0.1%
7      4485   55.1%
6      1872   23.0%
5      192    2.4%
4      448    5.5%
3      576    7.1%
2      96     1.2%
1      36     0.4%
0      432    5.3%

gill-color
Real number (ℝ≥0)

ZEROS

Distinct count: 12
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.811663597298956
Minimum: 0
Maximum: 11
Zeros: 1728
Zeros (%): 21.2%
Memory size: 63.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 5
Q3: 7
95-th percentile: 10
Maximum: 11
Range: 11
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.537566635
Coefficient of variation (CV): 0.7352065587
Kurtosis: -1.284274143
Mean: 4.811663597
Median Absolute Deviation (MAD): 3
Skewness: 0.06091970843
Sum: 39191
Variance: 12.5143777
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
0      1728   21.2%
7      1500   18.4%
10     1203   14.8%
5      1052   12.9%
2      756    9.3%
3      733    9.0%
9      492    6.0%
4      411    5.0%
1      96     1.2%
11     86     1.1%
6      64     0.8%
8      24     0.3%

Smallest values

Value  Count  Frequency (%)
0      1728   21.2%
1      96     1.2%
2      756    9.3%
3      733    9.0%
4      411    5.0%
5      1052   12.9%
6      64     0.8%
7      1500   18.4%
8      24     0.3%
9      492    6.0%

Largest values

Value  Count  Frequency (%)
11     86     1.1%
10     1203   14.8%
9      492    6.0%
8      24     0.3%
7      1500   18.4%
6      64     0.8%
5      1052   12.9%
4      411    5.0%
3      733    9.0%
2      756    9.3%

cap-surface
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
3      3251   39.9%
2      2562   31.5%
0      2328   28.6%
1      4      < 0.1%
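
The frequency column can be reproduced from the counts alone. The sketch below is a minimal illustration; the rule that shares under 0.1% are shown as "< 0.1%" is an assumption inferred from the table, and `format_share` is a hypothetical helper, not part of the profiling tool.

```python
# Value -> count for cap-surface, copied from the table above.
counts = {"3": 3251, "2": 2562, "0": 2328, "1": 4}
total = sum(counts.values())


def format_share(count, total):
    """Format a count as a percentage of the total, one decimal place."""
    pct = 100 * count / total
    # Assumption: very small non-zero shares are displayed as "< 0.1%".
    return "< 0.1%" if 0 < pct < 0.1 else f"{pct:.1f}%"


# Most frequent values first, as in the table.
rows = [
    (value, count, format_share(count, total))
    for value, count in sorted(counts.items(), key=lambda kv: -kv[1])
]
for row in rows:
    print(*row)
```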
 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
3      3251   39.9%
2      2562   31.5%
0      2328   28.6%
1      4      < 0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8145   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
3      3251   39.9%
2      2562   31.5%
0      2328   28.6%
1      4      < 0.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  8145   100.0%

Most frequent Common characters

Value  Count  Frequency (%)
3      3251   39.9%
2      2562   31.5%
0      2328   28.6%
1      4      < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8145   100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
3      3251   39.9%
2      2562   31.5%
0      2328   28.6%
1      4      < 0.1%

veil-type
Boolean

CONSTANT
REJECTED

Distinct count: 1
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
0      8145   100.0%
 

gill-attachment
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
1      7935   97.4%
0      210    2.6%

population
Real number (ℝ≥0)

ZEROS

Distinct count: 6
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.642971147943524
Minimum: 0
Maximum: 5
Zeros: 385
Zeros (%): 4.7%
Memory size: 63.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 4
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.25170611
Coefficient of variation (CV): 0.3435948458
Kurtosis: 1.675005546
Mean: 3.642971148
Median Absolute Deviation (MAD): 1
Skewness: -1.411394602
Sum: 29672
Variance: 1.566768185
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
4      4045   49.7%
5      1714   21.0%
3      1260   15.5%
2      401    4.9%
0      385    4.7%
1      340    4.2%

Smallest values

Value  Count  Frequency (%)
0      385    4.7%
1      340    4.2%
2      401    4.9%
3      1260   15.5%
4      4045   49.7%
5      1714   21.0%

Largest values

Value  Count  Frequency (%)
5      1714   21.0%
4      4045   49.7%
3      1260   15.5%
2      401    4.9%
1      340    4.2%
0      385    4.7%

stalk-surface-above-ring
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
2      5195   63.8%
1      2372   29.1%
0      554    6.8%
3      24     0.3%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
2      5195   63.8%
1      2372   29.1%
0      554    6.8%
3      24     0.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8145   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
2      5195   63.8%
1      2372   29.1%
0      554    6.8%
3      24     0.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  8145   100.0%

Most frequent Common characters

Value  Count  Frequency (%)
2      5195   63.8%
1      2372   29.1%
0      554    6.8%
3      24     0.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8145   100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
2      5195   63.8%
1      2372   29.1%
0      554    6.8%
3      24     0.3%

bruises?
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
0      4755   58.4%
1      3390   41.6%

odor
Real number (ℝ≥0)

ZEROS

Distinct count: 9
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.141313689379988
Minimum: 0
Maximum: 8
Zeros: 407
Zeros (%): 5.0%
Memory size: 63.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 5
Q3: 5
95-th percentile: 8
Maximum: 8
Range: 8
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.105002343
Coefficient of variation (CV): 0.5082933825
Kurtosis: -0.775168329
Mean: 4.141313689
Median Absolute Deviation (MAD): 2
Skewness: -0.08206752281
Sum: 33731
Variance: 4.431034865
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
5      3535   43.4%
2      2160   26.5%
7      576    7.1%
8      576    7.1%
0      407    5.0%
3      406    5.0%
6      257    3.2%
1      192    2.4%
4      36     0.4%

Smallest values

Value  Count  Frequency (%)
0      407    5.0%
1      192    2.4%
2      2160   26.5%
3      406    5.0%
4      36     0.4%
5      3535   43.4%
6      257    3.2%
7      576    7.1%
8      576    7.1%

Largest values

Value  Count  Frequency (%)
8      576    7.1%
7      576    7.1%
6      257    3.2%
5      3535   43.4%
4      36     0.4%
3      406    5.0%
2      2160   26.5%
1      192    2.4%
0      407    5.0%

ring-number
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
1      7509   92.2%
2      600    7.4%
0      36     0.4%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
1      7509   92.2%
2      600    7.4%
0      36     0.4%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8145   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
1      7509   92.2%
2      600    7.4%
0      36     0.4%

Most occurring scripts

Value   Count  Frequency (%)
Common  8145   100.0%

Most frequent Common characters

Value  Count  Frequency (%)
1      7509   92.2%
2      600    7.4%
0      36     0.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8145   100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
1      7509   92.2%
2      600    7.4%
0      36     0.4%

stalk-surface-below-ring
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
2      4953   60.8%
1      2304   28.3%
0      601    7.4%
3      287    3.5%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
2      4953   60.8%
1      2304   28.3%
0      601    7.4%
3      287    3.5%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8145   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
2      4953   60.8%
1      2304   28.3%
0      601    7.4%
3      287    3.5%

Most occurring scripts

Value   Count  Frequency (%)
Common  8145   100.0%

Most frequent Common characters

Value  Count  Frequency (%)
2      4953   60.8%
1      2304   28.3%
0      601    7.4%
3      287    3.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8145   100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
2      4953   60.8%
1      2304   28.3%
0      601    7.4%
3      287    3.5%

ring-type
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.294413750767342
Minimum: 0
Maximum: 4
Zeros: 2780
Zeros (%): 34.1%
Memory size: 63.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2
Q3: 4
95-th percentile: 4
Maximum: 4
Range: 4
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.801753533
Coefficient of variation (CV): 0.7852783886
Kurtosis: -1.707784602
Mean: 2.294413751
Median Absolute Deviation (MAD): 2
Skewness: -0.2925260816
Sum: 18688
Variance: 3.246315794
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
4      3985   48.9%
0      2780   34.1%
2      1296   15.9%
1      48     0.6%
3      36     0.4%

Smallest values

Value  Count  Frequency (%)
0      2780   34.1%
1      48     0.6%
2      1296   15.9%
3      36     0.4%
4      3985   48.9%

Largest values

Value  Count  Frequency (%)
4      3985   48.9%
3      36     0.4%
2      1296   15.9%
1      48     0.6%
0      2780   34.1%

veil-color
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
2      7945   97.5%
1      96     1.2%
0      96     1.2%
3      8      0.1%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
2      7945   97.5%
0      96     1.2%
1      96     1.2%
3      8      0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8145   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
2      7945   97.5%
0      96     1.2%
1      96     1.2%
3      8      0.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  8145   100.0%

Most frequent Common characters

Value  Count  Frequency (%)
2      7945   97.5%
0      96     1.2%
1      96     1.2%
3      8      0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8145   100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
2      7945   97.5%
0      96     1.2%
1      96     1.2%
3      8      0.1%

cap-color
Real number (ℝ≥0)

ZEROS

Distinct count: 10
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.511479435236342
Minimum: 0
Maximum: 9
Zeros: 168
Zeros (%): 2.1%
Memory size: 63.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 3
Median: 4
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.548863741
Coefficient of variation (CV): 0.5649729268
Kurtosis: -0.846329753
Mean: 4.511479435
Median Absolute Deviation (MAD): 1
Skewness: 0.7024968851
Sum: 36746
Variance: 6.496706369
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
4      2287   28.1%
3      1843   22.6%
2      1500   18.4%
9      1081   13.3%
8      1046   12.8%
0      168    2.1%
5      144    1.8%
1      44     0.5%
7      16     0.2%
6      16     0.2%

Smallest values

Value  Count  Frequency (%)
0      168    2.1%
1      44     0.5%
2      1500   18.4%
3      1843   22.6%
4      2287   28.1%
5      144    1.8%
6      16     0.2%
7      16     0.2%
8      1046   12.8%
9      1081   13.3%

Largest values

Value  Count  Frequency (%)
9      1081   13.3%
8      1046   12.8%
7      16     0.2%
6      16     0.2%
5      144    1.8%
4      2287   28.1%
3      1843   22.6%
2      1500   18.4%
1      44     0.5%
0      168    2.1%

stalk-shape
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
1      4615   56.7%
0      3530   43.3%

habitat
Real number (ℝ≥0)

ZEROS

Distinct count: 7
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.510620012277471
Minimum: 0
Maximum: 6
Zeros: 3151
Zeros (%): 38.7%
Memory size: 63.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.72028988
Coefficient of variation (CV): 1.138797226
Kurtosis: -0.2641483349
Mean: 1.510620012
Median Absolute Deviation (MAD): 1
Skewness: 0.9830076938
Sum: 12304
Variance: 2.95939727
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
0      3151   38.7%
1      2155   26.5%
4      1146   14.1%
2      832    10.2%
5      371    4.6%
3      298    3.7%
6      192    2.4%

Smallest values

Value  Count  Frequency (%)
0      3151   38.7%
1      2155   26.5%
2      832    10.2%
3      298    3.7%
4      1146   14.1%
5      371    4.6%
6      192    2.4%

Largest values

Value  Count  Frequency (%)
6      192    2.4%
5      371    4.6%
4      1146   14.1%
3      298    3.7%
2      832    10.2%
1      2155   26.5%
0      3151   38.7%

gill-size
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
0      5626   69.1%
1      2519   30.9%

stalk-root
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1134438305709025
Minimum: 0
Maximum: 4
Zeros: 2480
Zeros (%): 30.4%
Memory size: 63.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 4
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.063156529
Coefficient of variation (CV): 0.9548362475
Kurtosis: 0.07543954876
Mean: 1.113443831
Median Absolute Deviation (MAD): 1
Skewness: 0.9430869608
Sum: 9069
Variance: 1.130301805
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
1      3779   46.4%
0      2480   30.4%
3      1128   13.8%
2      563    6.9%
4      195    2.4%

Smallest values

Value  Count  Frequency (%)
0      2480   30.4%
1      3779   46.4%
2      563    6.9%
3      1128   13.8%
4      195    2.4%

Largest values

Value  Count  Frequency (%)
4      195    2.4%
3      1128   13.8%
2      563    6.9%
1      3779   46.4%
0      2480   30.4%

target
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.8 KiB

Value  Count  Frequency (%)
0      4228   51.9%
1      3917   48.1%

Interactions

[81 interaction plots (Matplotlib figures) are not reproduced in this text version.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
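
The calculation described above can be sketched in plain Python. This is a minimal illustration with made-up sample data, not the code used to produce the report; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)

# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], ys)
print(r, r_rescaled)
```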

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
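
A minimal sketch of this rank-based calculation follows. It assumes tied values receive the average of the ranks they span, which is a common convention but is not confirmed by the report; the helper names and sample data are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Illustrative data: a monotonic but nonlinear relationship.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

On this example ρ equals 1 while Pearson's r stays below 1, which is the difference the paragraph above describes.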

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
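
The pair-counting definition above corresponds to the variant usually called tau-a, sketched below on made-up data. Statistical libraries often report the tie-corrected tau-b variant instead, so values in this report may differ from this sketch when ties are present; `kendall_tau_a` is a hypothetical helper name.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    pairs = list(combinations(range(len(xs)), 2))
    concordant = discordant = 0
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Illustrative data: two concordant pairs and one discordant pair out of three.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```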

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
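
A minimal sketch of a bias-corrected Cramér's V for a contingency table of counts is given below. It follows the correction as commonly stated for Bergsma's 2013 proposal (shrinking φ² and the table dimensions by terms in 1/(n - 1)); it is an illustration under that assumption, not the report's own implementation, and `cramers_v_corrected` is a hypothetical name.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a table of observed counts (list of rows)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    r, k = len(table), len(table[0])
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Illustrative tables: perfect association and exact independence.
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
v_independent = cramers_v_corrected([[25, 25], [25, 25]])
print(v_perfect, v_independent)
```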

Missing values

[Two missing-value plots (Matplotlib figures) are not reproduced in this text version.]

Sample

First rows

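index,cap-shape,stalk-color-above-ring,gill-color,cap-surface,veil-type,gill-attachment,population,stalk-surface-above-ring,bruises?,odor,ring-number,stalk-surface-below-ring,ring-type,veil-color,cap-color,stalk-shape,habitat,gill-size,stalk-root,target
0,5,7,4,2,0,1,3,2,1,6,1,2,4,2,4,0,5,1,3,1
1,5,7,4,2,0,1,2,2,1,0,1,2,4,2,9,0,1,0,2,0
2,0,7,5,2,0,1,2,2,1,3,1,2,4,2,8,0,3,0,2,0
3,5,7,5,3,0,1,3,2,1,6,1,2,4,2,8,0,5,1,3,1
4,5,7,4,2,0,1,0,2,0,5,1,2,0,2,3,1,1,0,3,0
5,5,7,5,3,0,1,2,2,1,0,1,2,4,2,9,0,1,0,2,0
6,0,7,2,2,0,1,2,2,1,0,1,2,4,2,8,0,3,0,2,0
7,0,7,5,3,0,1,3,2,1,3,1,2,4,2,8,0,3,0,2,0
8,5,7,7,3,0,1,4,2,1,6,1,2,4,2,8,0,1,1,3,1
9,0,7,2,2,0,1,3,2,1,0,1,2,4,2,9,0,3,0,2,0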

Last rows

index,cap-shape,stalk-color-above-ring,gill-color,cap-surface,veil-type,gill-attachment,population,stalk-surface-above-ring,bruises?,odor,ring-number,stalk-surface-below-ring,ring-type,veil-color,cap-color,stalk-shape,habitat,gill-size,stalk-root,target
8135,0,7,2,3,0,1,3,2,1,3,1,2,4,2,9,0,3,0,2,0
8136,2,7,2,0,0,1,4,2,0,5,1,2,4,2,3,0,5,1,3,0
8137,5,7,7,0,0,1,4,2,1,3,1,2,4,2,9,1,0,1,1,0
8138,5,7,2,2,0,1,2,2,1,0,1,2,4,2,9,0,3,0,2,0
8139,2,7,5,0,0,1,5,2,0,5,1,2,4,2,4,0,5,1,3,0
8140,5,7,3,0,0,1,3,0,0,5,1,2,0,2,3,1,1,0,3,0
8141,5,7,7,3,0,1,3,2,1,3,1,3,4,2,9,0,4,0,4,0
8142,2,7,7,2,0,1,4,2,1,3,1,2,4,2,8,1,0,1,1,0
8143,5,7,4,2,0,1,3,2,1,3,1,2,4,2,9,0,1,0,2,0
8144,2,7,7,3,0,1,5,2,1,0,1,3,4,2,9,0,4,0,4,0

Duplicate rows

Most frequent

index,cap-shape,stalk-color-above-ring,gill-color,cap-surface,veil-type,gill-attachment,population,stalk-surface-above-ring,bruises?,odor,ring-number,stalk-surface-below-ring,ring-type,veil-color,cap-color,stalk-shape,habitat,gill-size,stalk-root,target,count
232,2,3,5,0,0,1,4,2,1,5,1,2,4,2,2,1,0,0,1,0,6
233,2,3,5,0,0,1,4,2,1,5,1,2,4,2,3,1,0,0,1,0,6
234,2,3,5,0,0,1,4,2,1,5,1,2,4,2,4,1,0,0,1,0,6
235,2,3,5,0,0,1,5,2,1,5,1,2,4,2,2,1,0,0,1,0,6
236,2,3,5,0,0,1,5,2,1,5,1,2,4,2,3,1,0,0,1,0,6
237,2,3,5,0,0,1,5,2,1,5,1,2,4,2,4,1,0,0,1,0,6
238,2,3,5,3,0,1,4,2,1,5,1,2,4,2,2,1,0,0,1,0,6
239,2,3,5,3,0,1,4,2,1,5,1,2,4,2,3,1,0,0,1,0,6
240,2,3,5,3,0,1,4,2,1,5,1,2,4,2,4,1,0,0,1,0,6
241,2,3,5,3,0,1,5,2,1,5,1,2,4,2,2,1,0,0,1,0,6